speedup array work by using push instead of concat #185
Conversation
Which benchmarks are you using to determine "faster"? Due to the JIT in JS, if it's "intuition" or "theory", it's invariably a wrong assumption. Separately, since nobody has 10 million query string values, that's an irrelevant comparison. I'd need to see proof that it was actually faster on practical, real-world test cases, before wanting to use mutation, merely for performance.
```diff
-    obj[key] = [].concat(obj[key]).concat(val);
+    var cur = obj[key];
+    if (Array.isArray(cur)) {
+        cur.push(val);
```
It's also worth noting that this behaves differently depending on whether `val` is an array, because of the `concat` behavior. Since the decoder can be user-provided, it becomes very important to retain that auto-spreading behavior (as well as the lack of mutation).
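A small sketch of the behavioral difference being described (illustrative code, not qs's actual implementation): `concat` flattens an array argument one level, while a naive `push` nests it, so a push-based optimization has to special-case array values to stay compatible.

```javascript
// `concat` spreads an array argument; `push` does not.
var existing = ['a'];

// concat-based combine: an array value is spread into the result,
// and `existing` is left unmodified
var viaConcat = [].concat(existing).concat(['b', 'c']);

// naive push-based combine on a copy: the array value ends up nested
var viaPush = existing.slice();
viaPush.push(['b', 'c']);

console.log(JSON.stringify(viaConcat)); // ["a","b","c"]
console.log(JSON.stringify(viaPush));   // ["a",["b","c"]]
```

This is why the reviewer flags the user-provided decoder: if a decoder returns an array, the two combining strategies would produce differently shaped results.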
I spent time setting up performance-testing runs and comparing four different implementations to see which ones work better. I didn't imagine it :) I also don't expect you to take my word for it; I figured you'd run some tests yourself to see, and I can share some simple perf-testing scripts if you'd like. I didn't get into the details in the PR because, when contributing to a new project, I'm never sure whether the maintainer is still active or cares about PRs :) So I throw some in there to see how they respond.

I spotted this section because the `concat` function is one of those things on my mental list to look out for: performance testing shows that avoiding it is often faster. I think running the tests many times is relevant because it causes runtime optimizations to kick in and gives more realistic results; a server will eventually run for a while and parse millions of query strings. I tested the implementations by giving them these permutations:
The implementations I tested:
I left it the simpler way, expecting I'd be happy to change the PR to use that implementation if you're interested. And I can provide some performance testing.
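The specific scripts weren't included in the thread, but a minimal timing harness for this kind of comparison might look like the following (function names, inputs, and iteration counts are illustrative, not from the PR). The warm-up loop reflects the point above about letting runtime optimizations kick in before measuring.

```javascript
// Illustrative micro-benchmark sketch: compare concat-based vs push-based
// combining. All names and counts here are my own, not the PR author's.

function combineConcat(cur, val) {
    return [].concat(cur).concat(val);
}

function combinePush(cur, val) {
    if (Array.isArray(cur)) {
        // copy first so the comparison keeps non-mutating semantics
        var copy = cur.slice();
        Array.isArray(val) ? copy.push.apply(copy, val) : copy.push(val);
        return copy;
    }
    return [].concat(cur, val);
}

function time(label, fn) {
    // warm up so the JIT optimizes the function before we time it
    for (var i = 0; i < 1e5; i++) { fn(['a', 'b'], 'c'); }
    var start = Date.now();
    for (var j = 0; j < 1e6; j++) { fn(['a', 'b'], 'c'); }
    console.log(label, Date.now() - start, 'ms');
}

time('concat:', combineConcat);
time('push:  ', combinePush);
```

A wall-clock loop like this is cruder than Benchmark.js (which the author used), but it illustrates the warm-up-then-measure shape of the tests being described.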
Thanks, I'm glad to hear you're being thorough :-) So, to reiterate, my concerns are:
One way to mitigate the code-clarity concern is to move the current implementation into a helper function first, leaving the main function clean, and then optimize the performance of that helper function. In other words, if we stick the combining logic in a helper, it can be optimized in isolation. Please do update the PR; it's exceedingly rare that a performance claim is actually worth optimizing for when evidence is requested, so it'd be fun to see this one bear out :-)
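The suggested refactor might be sketched like this (the helper name `combine` matches `utils.combine` mentioned later in the thread; the body shown is the current, unoptimized behavior):

```javascript
// Sketch of the suggested refactor: the call site stays one clean line,
// and the combining logic lives in a helper that can be optimized later.

function combine(a, b) {
    // current behavior: non-mutating, and concat auto-spreads array values
    return [].concat(a, b);
}

// the main parsing function would then just do:
// obj[key] = combine(obj[key], val);

console.log(combine(['x'], 'y')); // [ 'x', 'y' ]
```

Any later push-based optimization then changes only the helper, keeping the main function readable.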
I don’t consider speed the least consideration. If it’s very clear code but takes forever, then it’s of no use. I agree about clarity: code should be as readable as possible, and JavaScript has a tough time with that; all the extra braces and parentheses make it messy. Also, this code is lacking comments, so I didn’t put my own in, otherwise I would have documented what it’s doing. I think comments are required for clarity, and I added a few to the new version of this.

I’m wary of mutation, but not wholly opposed to its use where appropriate. If the array came from outside the library, then “defensive copying” would be a good idea at the start. In this case, the qs library creates the array itself, uses it repeatedly, and then hands it back to the user afterwards. So it’s okay for it to mutate the arrays it creates for itself while processing the query string.

I’m also a fan of moving code blocks out into their own functions; monolithic functions are tough to deal with. I moved this one out to the already-require()’d “utils”, and also altered its structure for greater readability.

I’m providing the benchmark info via a separate PR, #187, so that it isn’t part of a code-change PR. Below is an image of the results I get when running the benchmark. The column headers show the types of the two inputs given to the implementations, and the first column holds the row headers indicating which implementation it is. The numbers are produced by BenchmarkJS: the average number of operations during a run, with the +- figures showing the variance of the result from run to run.
```javascript
// mutate `a` to append `b` and then return it
if (firstIsArray) {
    secondIsArray ? a.push.apply(a, b) : a.push(b);
```
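The diff shows only a fragment; a completed sketch of a mutating `combine` along these lines might be (the surrounding lines are a reconstruction, not necessarily the PR's exact code):

```javascript
// Reconstruction of a mutating combine helper, assembled around the
// fragment shown in the diff above.

function combine(a, b) {
    var firstIsArray = Array.isArray(a);
    var secondIsArray = Array.isArray(b);

    if (firstIsArray) {
        // mutate `a` to append `b` and then return it
        secondIsArray ? a.push.apply(a, b) : a.push(b);
        return a;
    }
    // `a` is not an array: build a fresh array
    // (concat still auto-spreads `b` when it is an array)
    return [].concat(a, b);
}
```

Note that the first branch mutates and returns its argument, which is safe here only because, per the discussion above, qs creates these arrays itself.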
It's probably a good idea to merge in `utils.combine` with `concat` first into master, and then have the optimization PR be separate.
The current implementation creates "throw away" (intermediary) arrays with two `concat()` calls on an empty initial new array. It could instead be

```javascript
obj[key] = [].concat(obj[key], val);
```

which puts both values in the same call to `concat()`. This is only marginally faster. The fastest way is using `push()` when it's already an array, and creating the new array with the values otherwise. When testing with 10 million iterations, this way is an order of magnitude faster than the two methods above. It also avoids creating a lot of "throw away" arrays.

The Travis CI failure is due to new linting seeing unnecessary escapes. I fixed those in a new PR, #184.
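For illustration, the three approaches discussed in this comment side by side (assembled here as standalone functions; only the one-liners themselves come from the thread):

```javascript
// 1. Two concat calls: allocates an intermediary array per combine.
function twoConcats(cur, val) {
    return [].concat(cur).concat(val);
}

// 2. One concat call: one fewer throwaway array, only marginally faster.
function oneConcat(cur, val) {
    return [].concat(cur, val);
}

// 3. push when already an array, new array otherwise: the variant the
// comment describes as fastest. Note it mutates `cur` when it is an array.
function pushBased(cur, val) {
    if (Array.isArray(cur)) {
        Array.isArray(val) ? cur.push.apply(cur, val) : cur.push(val);
        return cur;
    }
    return [].concat(cur, val);
}
```

All three produce the same shape of result for the same inputs; the differences are allocation count and whether the existing array is mutated.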